System and Method for Restricting Access through a Mantrap Portal

ABSTRACT

A method and system provide increased levels of security for a mantrap portal by continuously monitoring two zones of the mantrap: a primary zone and a secondary zone. A primary sensor determines that exactly one or zero people are present in the primary zone when requesting access into a secured area. A secondary sensor determines that exactly zero people are present in the secondary zone when access to the secured area is granted. The primary and secondary sensors in combination can detect piggyback events and tailgating events before granting access to a secured area. Further, the primary and secondary sensors in combination can detect the presence of unauthorized persons in a mantrap prior to granting access to the mantrap for exit from a secured area.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. patent application Ser. No. 10/702,059, entitled “Method and System for Enhanced Portal Security through Stereoscopy,” filed Nov. 5, 2003, the contents of which are hereby incorporated by reference.

BACKGROUND OF THE INVENTION

This invention relates to security systems that permit controlled access to a secured area. Specifically, this invention relates to automatic door control in a secured area using a mantrap portal.

A typical security issue associated with most access controlled portal security systems is to ensure that when a person is authorized for entry into a secured area, it is only that person that is permitted to enter. A mantrap secured portal is a configuration of a secured portal that is commonly employed for restricting access to only one authorized person at a time.

A mantrap portal is a small room with two doors: one door for access to/from an unsecured area (called “landside”); and one door for access to/from a secured area (called “airside”). The basic operation of a mantrap for entry into a secured area from an unsecured area can be described with reference to FIG. 1.

A representative mantrap 100 is shown to provide an access portal between an unsecured landside region 130 and a secured airside region 140. The mantrap 100 has a landside door 120 and an airside door 110. The landside door 120 can be locked in the closed position with landside lock 150, and the airside door 110 can be locked in the closed position with airside lock 160. In the normal, unoccupied configuration (not shown), the airside door 110 is closed and locked with airside lock 160, while the landside door 120 is closed, but not locked by landside lock 150.

A person seeking access to the secured area (airside) will approach the mantrap 100 represented by person 125. The landside door can be opened while the airside door is locked. Once the person seeking access is fully inside the mantrap, as represented by person 105, a request for entry can be made at entry request 155. The entry request can be a card reader, a doorbell, or a biometric input such as a palm or fingerprint reader, or a retina scan. Once entry access is granted, the landside door 120 is in the closed position and locked by landside lock 150. With the landside door 120 locked closed, the airside lock 160 can be released, and the airside door 110 can be opened. The person seeking access can enter the secured area, represented as person 115. Once the airside door 110 is closed and locked by airside lock 160, the landside lock 150 can be released, to return the mantrap to the normal, unoccupied position.

The mantrap 100 can operate to permit a person to exit the secured airside region 140 while maintaining a high degree of security. A request can be made at exit request 165, which starts the door locking cycle. The landside door 120 is locked by landside lock 150, and the airside door 110 is unlocked by airside lock 160. The person seeking to exit can enter the mantrap, and the airside door 110 can be locked so that the landside door 120 can be unlocked, thereby permitting a person to exit. The mantrap configuration operates to control access since the door to the unsecured area can be locked in the closed position while the door to the secured area is open.

The basic operation of a mantrap portal becomes increasingly complex as security of the portal is enhanced. For example, mantrap portals are commonly equipped with IR sensors or pressure mats to prevent piggyback and tailgate violations.

Piggybacking can occur when an authorized person knowingly or unknowingly provides access through a portal to another traveling in the same direction. If a second, unauthorized person is permitted to enter the secured area with the authorized person, the security is breached.

Tailgating can occur when an authorized person knowingly or unknowingly provides unauthorized access through a portal to another traveling in the opposite direction. For example, an unauthorized person entering the mantrap from the unsecured area can wait until someone leaves the secured area - and while the door is opened into the mantrap from the secured area, the unauthorized person can enter, thereby breaching security.

Piggybacking and tailgating can be prevented in a mantrap using door locks controlled by a door controller that has the ability to count the number of people in the mantrap. To prevent piggybacking violations, the door to the secured area is only unlocked if there is exactly one authorized person seeking access to the secured area. Tailgating is prevented by unlocking the door from the secured area, to permit someone to exit, only if nobody is detected in the mantrap.

Mantrap portals with enhanced security, such as pressure mats and IR sensors, are easily defeated by two people walking close together, or by one person carrying the other. Accordingly, there exists a need for a system that can effectively enhance the security of a mantrap portal.

BRIEF SUMMARY OF THE INVENTION

The present invention provides for improved methods and systems for restricting access to a secured area using a mantrap portal. An embodiment of the present invention continuously monitors a primary zone to determine the presence or absence of one person in the primary zone. The primary zone is a region of the mantrap having an area less than the area of the entire mantrap, preferably located at a location proximal to the airside door. While the primary zone is monitored, the present invention continuously monitors a secondary zone to determine that no persons are present. The secondary zone is a region of the mantrap not including the primary zone. When the primary zone has exactly one or zero people present, and at the same time the secondary zone has exactly zero people present, the mantrap door locking/unlocking cycle can commence to permit access/egress to/from the secured area.

An exemplary embodiment of the present invention uses a three-dimensional machine vision sensor to monitor the primary zone and the secondary zone to identify and track detected features that can be associated with people or a person. When used in conjunction with a door access control system, alarm conditions can be generated when unexpected conditions are detected.

Other embodiments of the present invention use a three-dimensional machine vision sensor to monitor the primary zone in combination with one or more presence/absence detectors to monitor the secondary zone.

Further embodiments disclose methods and systems that perform additional two-dimensional image analysis of regions of the mantrap in combination with a three-dimensional image analysis so that the extreme extents of the respective primary and secondary zones, and regions not captured by the respective primary and secondary zones are analyzed for the presence of any people or objects.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The present invention is further described in the detailed description which follows, by reference to the noted drawings by way of non-limiting exemplary embodiments, in which like reference numerals represent similar parts throughout the several views of the drawings, and wherein:

FIG. 1 is a plan view of a mantrap security portal according to the background art;

FIG. 2 is a plan view of a mantrap security portal according to the present invention;

FIG. 3 is a block diagram of a control system according to the present invention;

FIG. 4 is a flowchart of the operation of the mantrap security portal according to the present invention;

FIG. 5 is a perspective view of an embodiment of the present invention;

FIG. 6 is a flowchart of the method used to detect people or objects according to the exemplary embodiment of the present invention;

FIG. 7 is a flowchart of the additional image analysis methods used to detect people or objects according to an alternate embodiment of the present invention;

FIG. 8 is a plan view of a mantrap security portal according to an exemplary embodiment of the present invention;

FIG. 9 is a block diagram illustrating a coarse segmentation process that identifies coarse people candidates according to an embodiment of the present invention;

FIG. 10 is a diagram illustrating the coarse segmentation process that identifies coarse people candidates according to an embodiment of the present invention;

FIG. 11 is a block diagram illustrating a fine segmentation process for validating or discarding coarse people candidates according to an embodiment of the present invention;

FIG. 12 is a diagram illustrating a fine segmentation process for validating or discarding coarse people candidates according to an embodiment of the present invention;

FIG. 13 is a diagram illustrating a fine segmentation process for validating or discarding coarse people candidates according to an embodiment of the present invention; and

FIG. 14 is a block diagram illustrating a method used to determine the number of people candidates by confidence level scoring according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring to FIG. 2, in accordance with the present invention, there is provided a mantrap 100 to permit an enhanced level of security. The mantrap 100 is a portal region between an unsecured landside region 130 and a secured airside region 140. The mantrap 100 has a landside door 120 for access into and out from the landside region 130, and an airside door 110, for access into and out from the airside region 140. An airside door lock 160 permits remote locking of the airside door 110, and a landside door lock 150 permits remote locking of the landside door 120. An entry request 155 is shown as a panel for requesting access into the secured airside region 140, and an exit request 165 is shown as a panel for requesting access from the secured airside region 140 into the mantrap 100.

As shown in FIG. 2, a primary zone 210 is established as a region in the mantrap having an area less than the area of the mantrap 100. A primary sensor 230 monitors the primary zone 210 to determine if exactly one person is present in the primary zone. The primary zone 210 can be located anywhere within the mantrap 100, though preferably, the primary zone 210 is located adjacent to the airside door 110.

A secondary zone 220 is established as a region within the mantrap 100, not including the primary zone 210. The secondary zone 220 does not need to include the entire region of the mantrap 100 exclusive of the primary zone 210, though it is preferred that any region within the mantrap not inclusive of the primary zone 210 and the secondary zone 220 be not large enough for a person to occupy.

A secondary sensor 240 monitors the secondary zone to determine whether or not a person or object exists within the secondary zone 220.

Referring to FIG. 3, a block diagram is shown in accordance with the present invention. A controller 310 of the type conventionally known in the art of access control for security applications is used to control the airside door lock 160 and the landside door lock 150. The controller can be any device that is capable of reading inputs, processing simple logic, and controlling the landside door and airside door. The controller may have the capability for performing automatic door control, i.e., opening and closing, in addition to actuation of the respective door locks. The controller 310 can be a Programmable Logic Controller (PLC), or a Personal Computer (PC) with the appropriate software instructions.

The controller 310 is responsive to signals from an entry request 155 and an exit request 165 upon presentation of an appropriate credential by the person seeking access/exit. Each of the entry request 155 and exit request 165 being of the type conventionally known in the art of access control for security, including, but not limited to, card readers, keypad terminals, or biometric input stations, such as finger- or palm-print readers, retinal scanners, or voice recognition stations.

The controller 310 is adapted to receive input signals from the primary sensor 230 and the secondary sensor 240 to actuate the airside door lock 160 and the landside door lock 150 in response to either of the entry request 155 or exit request 165 terminals. FIG. 4 depicts a flowchart of the basic operation of the controller 310 according to the present invention.

Referring to FIG. 4, the controller initializes the mantrap 100 with the appropriate signals to lock the airside door at step 410, and unlock the landside door at step 420. The entry request terminal 155 is monitored at step 430 and the exit request terminal 165 is monitored at step 440. If neither an entry request at step 430 nor an exit request at step 440 is made, processing loops continuously.

Referring to FIG. 2 in conjunction with FIG. 4, a person seeking access to the secured airside region approaches the mantrap, shown as person 125. Once an entry request is made, processing continues to step 450 where the outputs of the primary sensor 230 and the secondary sensor 240 are considered by the controller 310. If the primary sensor does not output a signal indicating that one person is in the primary zone, or if the secondary sensor does not output a signal indicating that no objects or people are detected in the secondary zone, processing continues by looping in place, as shown by processing path 455, until both conditions are met.

When the person seeking access is in the primary zone 210, shown in FIG. 2 as person 105, the primary sensor outputs a signal indicating that one person is detected in the primary zone. If there are no people or objects detected in the secondary zone 220, the secondary sensor outputs a signal indicating that no such people or objects are detected, and processing continues. At this point, the landside door is locked at step 470 and the airside door is unlocked at step 480, so that the person seeking access can enter the secured airside region, shown as person 115.

Processing continues by looping back to step 410 where the airside door is returned to the locked state, and the landside door is unlocked at step 420.

If an exit request is detected at step 440, processing continues to step 460, where the signals from the primary sensor 230 and secondary sensor 240 are considered by the controller 310. At step 460, if the primary sensor does not indicate that no people are present in the primary zone 210 or if the secondary sensor does not indicate that no people or objects are present in the secondary zone 220, processing continues by looping in place, as shown by processing path 465.

When both the primary sensor detects that zero people are present in the primary zone 210, and the secondary sensor detects that no people or objects are present in the secondary zone, processing continues to step 470 where the landside door is locked. When the airside door is unlocked at step 480, the person requesting to exit from the secured airside region can enter the mantrap through the airside door 110. At that point, the airside door can be locked at step 410 and the landside door can be unlocked at 420, so that the person can exit the mantrap through the landside door 120.
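The decision made at steps 450 and 460 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sensor reports an integer count to the controller; the names Request, may_start_door_cycle, primary_count and secondary_count are hypothetical and do not appear in the figures.

```python
from enum import Enum


class Request(Enum):
    ENTRY = "entry"   # request made at entry request 155
    EXIT = "exit"     # request made at exit request 165


def may_start_door_cycle(request, primary_count, secondary_count):
    """Return True when the lock/unlock cycle (steps 470 and 480) may begin.

    primary_count and secondary_count are hypothetical integer outputs of
    the primary and secondary sensors (people or objects detected per zone).
    """
    if secondary_count != 0:
        return False               # the secondary zone must always be empty
    if request is Request.ENTRY:
        return primary_count == 1  # step 450: exactly one person in the primary zone
    if request is Request.EXIT:
        return primary_count == 0  # step 460: the primary zone must be empty
    return False
```

A controller loop would call this function repeatedly while traversing path 455 or path 465, and could raise an alarm if it keeps returning False for longer than a configured duration.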

One skilled in the art of controlling access to a secured area using a conventional door control system will appreciate that the basic operation of the mantrap 100 can be modified in various ways without departing from the scope and spirit of the present invention. For example, the entry request terminal 155 can be placed outside the mantrap in the unsecured landside region 130, and the normal idle state of the mantrap can be configured with both the airside door 110 and the landside door 120 in the locked state. Further, several alarm conditions can be initiated by the controller 310 if the looping path 455 or the looping path 465 are traversed for a specified duration.

In an exemplary embodiment of the present invention, the primary sensor 230 and the secondary sensor 240 are each a three-dimensional machine vision sensor described herein with reference to FIG. 5. Each of the primary sensor 230 and the secondary sensor 240 has a 3D image processor, memory, discrete I/O, and a set of stereo cameras 10, in an integrated unit mounted in the mantrap 100. The primary sensor 230 is mounted in the ceiling above the airside door 110 looking downward and outward towards the primary zone 210. The secondary sensor 240 is mounted in a position so that it can observe the secondary zone 220. One skilled in the art will appreciate that the primary sensor 230 and the secondary sensor 240 can be mounted in any number of positions relative to the respective primary and secondary zones.

In each of the sensors, the set of cameras 10 is calibrated to provide heights above the ground plane for any point in the field of view. Therefore, when any object enters the field of view, it generates interest points called “features”, the heights of which are measured relative to the ground plane. These points are then clustered in 3D space to provide “objects”. These objects are then tracked in multiple frames to provide “trajectories”.

In an exemplary system, the baseline distance between the optical centers of the cameras is 12 mm and the lenses have a focal length of 2.1 mm (150 degree Horizontal Field of View (HFOV)). The cameras are mounted approximately 2.2 meters from the ground and have a viewing area that is approximately 2.5 by 2.5 meters. The surface normal to the plane of the cameras points downward and outward as shown in FIG. 5 wherein the cameras are angled just enough to view the area just below the mounting point.

In the exemplary embodiment of the present invention various parameters are set up in the factory. The factory setup involves calibration and the computation of the intrinsic parameters for the cameras and the relative orientation between the cameras. Calibration involves the solution of several sub-problems, as discussed hereinafter, each of which has several solutions that are well understood by persons having ordinary skill in the art. Further, rectification coefficients, described hereinafter, must be computed to enable run time image correction.

Stereo measurements could be made in a coordinate system that is different from the coordinate systems of either camera. For example, the scene or world coordinates correspond to the points in a viewed scene. Camera coordinates (left and right) correspond to the viewer-centered representation of scene points. Undistorted image coordinates correspond to scene points projected onto the image plane. Distorted image coordinates correspond to points having undergone lens distortion. Pixel coordinates correspond to the grid of image samples in the image array.

In the exemplary embodiment one camera is designated to be a “reference camera”, to which the stereo coordinate system is tied. An interior orientation process is performed to determine the internal geometry of a camera. These parameters, also called the intrinsic parameters, include the following: effective focal length, also called the camera constant; location of the principal point, also called the image center; radial distortion coefficients; and horizontal scale factor, also called the aspect ratio. The cameras used in the exemplary embodiment have fixed-focus lenses that cannot be modified; therefore these parameters can be computed and preset at the factory.

A relative orientation process is also performed to determine the relative position and orientation between two cameras from projections of calibration points in the scene. Again, the cameras are mechanically fixtured such that they stay in alignment and hence these parameters can also be preset at the factory.

A rectification process, closely associated with the relative orientation, is also performed. Rectification is the process of resampling stereo images so that epipolar lines correspond to image rows. “An epipolar line on one stereo image corresponding to a given point in another stereo image is the perspective projection on the first stereo image of the three-dimensional ray that is the inverse perspective projection of the given point from the other stereo image,” as described in Robert M. Haralick & Linda G. Shapiro, Computer and Robot Vision Vol. II 598 (1993), incorporated herein by reference. If the left and right images are coplanar and the horizontal axes are collinear (no rotation about the optical axis), then the image rows are epipolar lines and stereo correspondences can be found along corresponding rows. These images, referred to as normal image pairs, provide computational advantages because the rectification of normal image pairs need only be performed one time.

The method for rectifying the images is independent of the representation used for the given pose of the two cameras. It relies on the principle that any perspective projection is a projective projection. Image planes corresponding to the two cameras are replaced by image planes with the desired geometry (normal image pair) while keeping the geometry of the rays spanned by the points and the projection centers intact. This results in a planar projective transformation. These coefficients can also be computed at the factory.

Given the parameters computed in interior orientation, relative orientation and rectification, the camera images can be corrected for distortion and misalignment either in software or hardware. The resulting corrected images have the geometry of a normal image pair, i.e., square pixels, aligned optical planes, aligned axes (rows), and pinhole camera model.

An exterior orientation process is also performed during factory set up of the exemplary embodiment. The exterior orientation process is needed because 3D points in a viewed scene are only known relative to the camera coordinate system. Exterior orientation determines the position and orientation of a camera in an absolute coordinate system. An absolute 3D coordinate system is established such that the XY plane corresponds to the ground plane and the origin is chosen to be an arbitrary point on the plane.

Ground plane calibration is performed at the location of the installation. In an embodiment, the primary sensor 230 and the secondary sensor 240 are mounted on a plane that is parallel to the floor, and the distance between the respective sensor and the floor is entered. Alternatively, calibration targets can be laid out in the floor to compute the relationship between the stereo coordinate system attached to the reference camera and the world or scene coordinates system attached to the ground plane.

Regions of interest are also set up manually at the location of the installation. This involves capturing the image from the reference camera (camera that the stereo coordinate system is tied to), rectifying it, displaying it and then using a graphics overlay tool to specify the zones to be monitored. Multiple zones can be pre-selected to allow for different run-time algorithms to run in each of the zones. The multiple zones typically include particular 3D spaces of interest. Filtering is performed to eliminate features outside of the zones being monitored, i.e., the primary zone 210. In alternative embodiments of the invention, automatic setup can be performed by laying out fiducial markings or tape on the floor.

While there are several methods to perform stereo vision to monitor each of the primary zone 210 and the secondary zone 220 according to the present invention, one such method is outlined below with reference to FIG. 6. This method detects features in a 3D scene using primarily boundary points or edges (due to occlusion and reflectance) because the information is most reliable only at these points. One skilled in the art will appreciate that the following method can be performed by each of the primary sensor 230 and the secondary sensor 240 simultaneously and independently. Because each of the respective sensors is independently coupled to the controller 310, it is not necessary for the primary and secondary sensors to communicate directly with each other.

Referring to FIG. 6, a set of two dimensional images is provided, e.g., a right image and a left image. One of the images is designated the reference image. Both of the images are rectified at step 610. Each respective rectification step is performed by applying an image rectification transform that corrects for alignment and lens distortion, resulting in virtually coplanar images. Rectification can be performed by using standard image rectification transforms known in the art. In an exemplary embodiment, the image rectification transform is implemented as a lookup table through which pixels of a raw image are transformed into pixels of a rectified image.
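A lookup-table rectification of the kind mentioned at step 610 might be sketched as follows. The table layout is an assumption made for illustration: each rectified pixel stores the coordinates of the raw pixel that feeds it, and nearest-neighbor sampling is used without interpolation. The name rectify_with_lut and its parameters are hypothetical.

```python
def rectify_with_lut(raw, lut, fill=0):
    """Build a rectified image by looking up a source pixel per output pixel.

    raw: list of rows of pixel values (the raw camera image).
    lut: lut[y][x] is the (src_y, src_x) position in `raw` that feeds the
         rectified pixel (y, x), or None when no raw pixel maps there.
    fill: value written where the table has no source pixel.
    """
    rectified = []
    for lut_row in lut:
        out_row = []
        for src in lut_row:
            if src is None:
                out_row.append(fill)
            else:
                src_y, src_x = src
                out_row.append(raw[src_y][src_x])
        rectified.append(out_row)
    return rectified
```

Because the table depends only on the factory-computed calibration and rectification coefficients, it could be computed once and reused for every frame.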

At 620, the rectified two-dimensional image points from the reference image (X_(R), Y_(R)) are matched to corresponding two-dimensional image points in the non-reference image (X_(L), Y_(L)). By first rectifying the images, reference image points (X_(R), Y_(R)) are matched to non-reference image points (X_(L), Y_(L)) along the same row, or epipolar line. Matching can be performed through known techniques in the art, such as in T. Kanade et al., A Stereo Machine for Video-rate Dense Depth Mapping and its New Applications, Proc. IEEE Computer Vision and Pattern Recognition (CVPR), pp. 196-202 (1996), the entire contents of which are incorporated herein by reference.

At 630, a set of disparities D corresponding to the matched image points is computed relative to the reference image points (X_(R), Y_(R)), resulting in a disparity map (X_(R), Y_(R), D), also called the depth map or the depth image. The disparity map contains a corresponding disparity ‘d’ for each reference image point (X_(R), Y_(R)). By rectifying the images, each disparity ‘d’ corresponds to a shift in the x-direction.
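The row-wise matching at 620 and the disparity computation at 630 can be illustrated with a minimal sketch. Sum-of-absolute-differences block matching is used here only as a simple stand-in; the description does not fix a particular matcher beyond referring to known techniques. The sketch also assumes the reference image is the right image, so that the matching point in the other image lies at a positive shift along the same row. The name match_row and its parameters are hypothetical.

```python
def match_row(ref_row, other_row, x_ref, half_window=1, max_disparity=8):
    """Return the shift d (in pixels) that best aligns a small window around
    x_ref in the reference row with a window around x_ref + d in the other row.

    Returns 0, treated as an invalid disparity, when no candidate window fits
    inside both rows.
    """
    best_d, best_cost = 0, None
    width = 2 * half_window + 1
    lo_ref = x_ref - half_window
    hi_ref = x_ref + half_window
    if lo_ref < 0 or hi_ref >= len(ref_row):
        return 0
    for d in range(1, max_disparity + 1):
        if hi_ref + d >= len(other_row):
            break  # the shifted window would leave the other row
        # Sum of absolute differences between the two windows.
        cost = sum(abs(ref_row[lo_ref + k] - other_row[lo_ref + d + k])
                   for k in range(width))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Running such a function over every reference pixel of interest would yield the disparity map (X_(R), Y_(R), D) described at 630.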

At 640, a three dimensional model of the door scene is generated in 3D world coordinates. In one embodiment, the three dimensional scene is first generated in 3D camera coordinates (X_(c), Y_(c), Z_(c)) from the disparity map (X_(R), Y_(R), D) and intrinsic parameters of the reference camera geometry. The 3D camera coordinates (X_(c), Y_(c), Z_(c)) for each image point are then converted into 3D world coordinates (X_(w), Y_(w), Z_(w)) by applying an appropriate coordinate system transform.
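For a rectified normal image pair under the pinhole model, the conversion at 640 can be sketched with the standard triangulation relation Z = f·B/d. The sketch below assumes a focal length expressed in pixels and a principal point (cx, cy); these, along with the function names and the example rotation and translation, are hypothetical values chosen for illustration. The world transform is shown as a generic rotation plus translation, since the description only calls for an appropriate coordinate system transform.

```python
def disparity_to_camera(x, y, d, focal_px, baseline, cx, cy):
    """Convert a reference-image point and its disparity to camera coordinates.

    Returns (Xc, Yc, Zc) in the units of `baseline`, or None for an invalid
    (non-positive) disparity.
    """
    if d <= 0:
        return None
    z = focal_px * baseline / d          # depth along the optical axis
    return ((x - cx) * z / focal_px, (y - cy) * z / focal_px, z)


def camera_to_world(point, rotation, translation):
    """Apply a rigid transform (3x3 rotation rows, 3-vector translation)."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
```

As a usage example, a hypothetical camera looking straight down from 2.2 meters could use the rotation ((1, 0, 0), (0, -1, 0), (0, 0, -1)) and the translation (0, 0, 2.2), so that Z_(w) becomes the height above the ground plane. The exemplary sensor is angled outward, so its actual transform would come from the exterior orientation and ground plane calibration.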

At 650, the target volume, i.e., the volume of space directly above the observed zone, can be dynamically adjusted and image points outside the target volume are clipped. The 3D world coordinates of the mantrap scene (X_(w), Y_(w), Z_(w)) that fall outside the 3D world coordinates of target volume are clipped. In a particular embodiment, clipping can be effectively performed by setting the disparity value ‘d’ to zero for each image point (X_(R), Y_(R)) whose corresponding 3D world coordinates fall outside the target volume, resulting in a filtered disparity map “filtered (X_(R), Y_(R), D)”. A disparity value that is equal to zero is considered invalid. The filtered disparity map is provided as input to a multi-resolution people segmentation process commencing at 660.
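The clipping at 650 can be sketched as follows, assuming for illustration that the target volume is an axis-aligned box in world coordinates and that the world coordinates of each image point are already available. The name clip_to_target_volume and the data layout are hypothetical.

```python
def clip_to_target_volume(disparity_map, world_points, volume):
    """Zero out disparities whose world coordinates fall outside the volume.

    disparity_map: rows of disparity values d for each reference image point.
    world_points:  rows of (Xw, Yw, Zw) tuples, or None where no 3D point exists.
    volume:        ((xmin, xmax), (ymin, ymax), (zmin, zmax)), an assumed
                   axis-aligned target volume.
    """
    filtered = []
    for d_row, p_row in zip(disparity_map, world_points):
        out_row = []
        for d, p in zip(d_row, p_row):
            inside = p is not None and all(
                lo <= coord <= hi for coord, (lo, hi) in zip(p, volume)
            )
            out_row.append(d if inside else 0)  # zero marks an invalid disparity
        filtered.append(out_row)
    return filtered
```

Encoding clipped points as zero disparity keeps the map the same size, so later stages only need to skip zero entries.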

At 660, coarse segmentation is performed for identifying people candidates within the target volume. According to one embodiment, coarse segmentation includes generating a topological profile of the target volume from a low resolution view of the filtered disparity map. Peaks within the topological profile are identified as potential people candidates. A particular embodiment for performing coarse segmentation is illustrated in FIGS. 9 and 10.

At 670, fine segmentation is performed for confirming or discarding people candidates identified during coarse segmentation. According to one embodiment, the filtered disparity map is analyzed within localized areas at full resolution. The localized areas correspond to the locations of the people candidates identified during the coarse segmentation process. In particular, the fine segmentation process attempts to detect head and shoulder profiles within three dimensional volumes generated from the localized areas of the disparity map. A particular embodiment for performing fine segmentation is illustrated in FIGS. 11 through 13.

Coarse Segmentation of People Candidates

FIGS. 9 and 10 are diagrams illustrating a coarse segmentation process that identifies coarse people candidates according to one embodiment. In particular, FIG. 9 is a flow diagram illustrating a coarse segmentation process that identifies coarse people candidates according to one embodiment. The detected locations of the coarse people candidates resulting from the segmentation process are then forwarded to a fine segmentation process for validation or discard.

At 700, the filtered disparity map is segmented into bins. For example, in FIG. 10, the filtered disparity map 755 includes points (X_(R), Y_(R), D) which are segmented into bins 752, such that each bin contains a set of image points (X_(BIN), Y_(BIN)) and their corresponding disparities (D_(BIN)).

At 701 of FIG. 9, a low resolution disparity map is generated from calculated mean disparity values of the bins. For example, in FIG. 10, a low resolution disparity map 760 is generated including points (X_(M), Y_(M), D_(M)) where the points (X_(M), Y_(M)) correspond to bin locations in the high resolution disparity map 755 and D_(M) corresponds to the mean disparity values d_(M) calculated from those bins.

In a particular embodiment, a mean disparity value d_(M) for a particular bin can be calculated by generating a histogram of all of the disparities D_(BIN) in the bin having points (X_(BIN), Y_(BIN)). Excluding the bin points in which the disparities are equal to zero and thus invalid, a normalized mean disparity value d_(M) is calculated. The normalized mean disparity d_(M) is assigned to a point in the low resolution disparity map for that bin.
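The generation of the low resolution disparity map at 700 and 701 can be sketched as follows. The sketch assumes square bins and uses a plain arithmetic mean of the valid (non-zero) disparities in each bin; the histogram and normalization details of the particular embodiment are not reproduced. The name low_resolution_map and the parameter bin_size are hypothetical.

```python
def low_resolution_map(filtered, bin_size):
    """Reduce a filtered disparity map to one mean disparity per bin.

    filtered: rows of disparity values, where 0 marks an invalid point.
    bin_size: side length of each square bin, in pixels (an assumption).
    A bin with no valid disparities is assigned 0.
    """
    rows, cols = len(filtered), len(filtered[0])
    low = []
    for by in range(0, rows, bin_size):
        low_row = []
        for bx in range(0, cols, bin_size):
            valid = [
                filtered[y][x]
                for y in range(by, min(by + bin_size, rows))
                for x in range(bx, min(bx + bin_size, cols))
                if filtered[y][x] != 0
            ]
            low_row.append(sum(valid) / len(valid) if valid else 0)
        low.append(low_row)
    return low
```

Excluding zero entries prevents clipped or unmatched points from pulling the mean of a bin toward zero.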

At 702 of FIG. 9, peaks are identified in the topological profile of the low resolution disparity map. In a particular embodiment, a peak is identified at a location in the low resolution disparity map having the largest value for mean disparity value d_(M). The extent of the peak is determined by traversing points in every direction, checking the disparity values at each point, and stopping in a direction when the disparity values start to rise. After determining the extent of the first peak, the process repeats for any remaining points in the low resolution map that have not been traversed.

For example, in FIG. 10, peak locations are identified at (x_(M1), y_(M1)) and (x_(M2), y_(M2)) of the low resolution disparity map 760 having mean disparity values d_(M1), d_(M2). The arrows extending from the peak locations illustrate the paths traversed from the peak locations. A watershed algorithm can be implemented for performing the traversal routine.

Alternatively, pixels in the disparity map having at least 3×3 neighborhoods can be determined to be relatively flat regions, which can be considered peak locations.
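The traversal-based peak search at 702 can be illustrated with a minimal sketch. It is a simple flood fill rather than a full watershed implementation: starting from the largest untraversed value, it spreads to 4-connected neighbors while values do not rise, then repeats on the remaining points. Treating zero-valued bins as invalid, and the name find_peaks, are assumptions made for illustration.

```python
def find_peaks(low):
    """Return peak locations as (x, y, mean_disparity), largest peak first.

    low: low resolution disparity map as rows of mean disparities,
         where 0 marks a bin with no valid data.
    """
    rows, cols = len(low), len(low[0])
    # Invalid bins are marked as traversed so they are never visited.
    visited = [[low[y][x] == 0 for x in range(cols)] for y in range(rows)]
    peaks = []
    while True:
        candidates = [(low[y][x], y, x)
                      for y in range(rows) for x in range(cols)
                      if not visited[y][x]]
        if not candidates:
            return peaks
        value, py, px = max(candidates)      # largest remaining mean disparity
        peaks.append((px, py, value))
        visited[py][px] = True
        stack = [(py, px)]
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < rows and 0 <= nx < cols
                        and not visited[ny][nx]
                        and low[ny][nx] <= low[y][x]):  # stop when values rise
                    visited[ny][nx] = True
                    stack.append((ny, nx))
```

Each returned location is only a coarse people candidate; in the described process it would still be validated or discarded by fine segmentation.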

At 703 of FIG. 9, each of the peak locations are converted to approximate head location in the high resolution filtered disparity map. For example, in FIG. 10, peak locations (x_(M1), y_(M1)) and (x_(M2), y_(M2)) in the low resolution disparity map 760 are converted into locations (x_(R1), y_(R1)) and (x_(R2), y_(R2)) in the high resolution disparity map 755. This conversion can be accomplished by multiplying the peak locations by the number and size of the bins in the corresponding x-or y-direction.

At 704 of FIG. 9, the locations of the coarse people candidates (e.g., (x_(R1), y_(R1)) and (x_(R2), y_(R2))) in the filtered disparity map and the mean disparity values d_(M1), d_(M2) of the corresponding peak locations are forwarded to a fine segmentation process for validating or discarding these locations as people candidates, as in FIG. 11.

Fine Segmentation of People Candidates

FIGS. 11, 12, and 13 are diagrams illustrating a fine segmentation process for validating or discarding coarse people candidates according to one embodiment. In particular, FIG. 11 is a flow diagram illustrating the fine segmentation process. The fine segmentation process obtains more accurate, or fine, locations of the coarse people candidates in the filtered disparity map and then determines whether the coarse people candidates have the characteristic head/shoulder profiles from localized analysis of the high resolution filtered disparity map. Depending on the results, the fine segmentation process either validates or discards the people candidates.

At 800, a two dimensional head template is generated having a size relative to the disparity of one of the coarse candidates. Disparity corresponds indirectly to height such that as disparity increases, the distance from the camera decreases, and thus the height of the person increases. For example, FIG. 12 is a block diagram of an exemplary head template according to one embodiment. In the illustrated embodiment, the template model 870 includes a head template 875. The head template 875 is a circular model that corresponds to the top view of a head.

The dimensions of the head template 875 are based on the coarse location of the candidate (e.g., x_(R1), y_(R1)), the mean disparity value (e.g., d_(M1)), and known dimensions of a standard head (e.g. 20 cm in diameter, 10 cm in radius). For example, to compute the dimensions of the head template, the position of the head is computed in 3D world coordinates (X, Y, Z) from the calculated coarse location and a mean disparity value using the factory data (e.g., intrinsic parameters of camera geometry) and field calibration data (e.g., camera to world coordinate system transform). Next, consider another point in the world coordinate system which is (X+10 cm, Y, Z) and compute the position of the point in the rectified image space (e.g., x_(R2), y_(R2)) which is the image space in which all the image coordinates are maintained. The length of the vector defined by (x_(R1), y_(R1)) and (x_(R2), y_(R2)) corresponds to the radius of the circular model for the head template 875.

Furthermore, each point within the area of the resulting head template 875 is assigned the mean disparity value (e.g., d_(M1)) determined for that candidate. Points outside the head template 875 are assigned an invalid disparity value equal to zero.
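The template sizing and filling described above can be approximated with a simple model. The sketch below assumes an ideal rectified stereo pair with baseline B, for which depth is Z = f·B/d, so a lateral offset R at that depth projects to R·d/B pixels and the focal length cancels. This stands in for the full calibration-based projection of the embodiment and ignores lens distortion and camera tilt. The names `head_radius_pixels`, `make_head_template`, and `baseline_m` are hypothetical.

```python
def head_radius_pixels(disparity_px, baseline_m, head_radius_m=0.10):
    """Approximate head radius in pixels for a head seen at a given disparity.

    Assumes an ideal rectified stereo pair: Z = f * B / d, so a lateral
    offset R at depth Z projects to f * R / Z = R * d / B pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero marks an invalid point)")
    return head_radius_m * disparity_px / baseline_m


def make_head_template(radius_px, mean_disparity):
    """Build a square grid holding a filled circle of the mean disparity.

    Points outside the circle receive the invalid disparity value 0.
    """
    r = int(round(radius_px))
    size = 2 * r + 1
    return [
        [mean_disparity if (x - r) ** 2 + (y - r) ** 2 <= r * r else 0
         for x in range(size)]
        for y in range(size)
    ]
```

Under these assumptions a larger disparity (a taller person, closer to the camera) produces a proportionally larger template.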

At 810 of FIG. 11, a fine location for the candidate is determined through template matching. For example, in the illustrated embodiment of FIG. 13, the template model 870 overlays the filtered disparity map 755 at an initial position corresponding to the coarse head location (e.g., x_(R1), y_(R1)). The disparities of the filtered disparity map 755 that fall within the head template 875 are then subtracted from the mean disparity value for the coarse people candidate (e.g., d_(M1)). A sum of the absolute values of these differences is then computed as a template score that serves as a relative indication of whether the underlying points of the filtered disparity map correspond to a head. Other correlation techniques may also be implemented to generate the template score.

The template matching is repeated, for example, by positioning the template 870 to other areas such that the center of the head template 875 corresponds to locations about the original coarse location of the candidate (e.g., x_(R1), y_(R1)). A fine location for the candidate (x_(F1), y_(F1)) is obtained from the position of the head template 875 at which the best template score was obtained.
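The template matching of step 810 can be sketched as a sum of absolute differences evaluated over a small search window. This illustration assumes that the lowest score is the best score, that points falling outside the map count as invalid (zero), and that the search covers a square window of ±`search` pixels about the coarse location. The names `template_score`, `refine_location`, and `search` are hypothetical.

```python
def template_score(disp, cx, cy, radius, mean_disp):
    """Sum of absolute differences between mean_disp and the disparities
    under a circular template centered at (cx, cy)."""
    total = 0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            if (x - cx) ** 2 + (y - cy) ** 2 > radius * radius:
                continue  # outside the circular head template
            inside = 0 <= y < len(disp) and 0 <= x < len(disp[0])
            d = disp[y][x] if inside else 0
            total += abs(mean_disp - d)
    return total


def refine_location(disp, coarse_x, coarse_y, radius, mean_disp, search=2):
    """Return (fine_x, fine_y, score) for the best-scoring template position."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = coarse_x + dx, coarse_y + dy
            score = template_score(disp, x, y, radius, mean_disp)
            if best is None or score < best[0]:
                best = (score, x, y)
    return best[1], best[2], best[0]
```

A coarse location that is off by one pixel is pulled onto the position where the disparities under the template agree most closely with the candidate's mean disparity.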

At 820, another mean disparity value d_(F1) is computed from the points of the filtered disparity map within the head template 875 centered at the fine candidate location (x_(F1), y_(F1)). In a particular embodiment, the mean disparity value d_(F1) can be calculated by generating a histogram of all the disparities of the filtered disparity map that fall within the head template. Excluding the points in which the disparities are equal to zero and thus invalid, the normalized mean disparity value d_(F1) is calculated.

At 830, people candidates are discarded for lack of coverage by analyzing the disparities that fall within the head template which is fixed at the fine head location. For example, it is known that disparity corresponds to the height of an object. Thus, a histogram of a person's head is expected to have a distribution, or coverage, of disparities that is centered at a particular disparity tapering downward. If the resulting histogram generated at 820 does not conform to such a distribution, it is likely that the candidate is not a person and the candidate is discarded for lack of coverage.

At 840, the process determines whether there are more coarse candidates to process. If so, the process returns to 800 to analyze the next candidate. Otherwise, the process continues at 850.

At 850, people candidates having head locations that overlap with head locations of other people candidates are discarded. In a particular embodiment, the head locations of all of the people candidates are converted from the filtered disparity map into their corresponding 3D world coordinates. People candidates whose head locations overlap with the head locations of other people candidates result in at least one of the candidates being discarded. Preferably, the candidate corresponding to a shorter head location is discarded, because the candidate likely corresponds to a neck, shoulder, or other object other than a person.
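The overlap test at step 850 might be sketched as follows. The overlap criterion is an assumption: two heads are taken to overlap when their centers are closer than one standard head diameter (20 cm) in the horizontal plane. Candidates are hypothetical tuples (X, Y, height) in world coordinates, and the name `discard_overlapping` is likewise hypothetical.

```python
def discard_overlapping(candidates, min_separation_m=0.20):
    """Keep the taller candidate whenever two head locations overlap.

    candidates: list of (X, Y, height) tuples in world coordinates (meters).
    """
    kept = []
    # Visit candidates from tallest to shortest so the shorter one is dropped.
    for cand in sorted(candidates, key=lambda c: c[2], reverse=True):
        separated = all(
            ((cand[0] - k[0]) ** 2 + (cand[1] - k[1]) ** 2) ** 0.5 >= min_separation_m
            for k in kept
        )
        if separated:
            kept.append(cand)
    return kept
```

A candidate found 10 cm from a taller head, which plausibly corresponds to a shoulder, is discarded, while a candidate a meter away is retained.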

At 860, the one or more resulting fine head locations (e.g., x_(F1), y_(F1)) of the validated people candidates and the corresponding mean disparity values (e.g., d_(F1)) are forwarded for further processing to determine if the number of people in the observed zone can be determined, at step 652.

Confidence Level Scoring of the Fuzzy Scoring Module

FIG. 14 is a flow diagram illustrating augmenting people candidates by confidence level scoring according to one embodiment. The input to the scoring algorithm includes the list of validated people candidates and their locations in the filtered disparity map. In particular, the input can be a data structure (e.g., array or linked list data structure) in which the size of the data structure corresponds to the number of validated people candidates.

If, at 900, the number of validated people candidates is equal to one or more persons, a confidence score F1 can be generated at 910. The confidence score F1 corresponds to a confidence level that the target volume contains only one person. The confidence score F1 can be a value between 0 and 1.

If, at 920, the number of validated people candidates is equal to two or more persons, a confidence score F2 can be generated at 930. The confidence score F2 corresponds to a confidence level that the target volume contains two or more persons. The confidence score F2 can be a value between 0 and 1.

At 940, a confidence score F0 can be generated regardless of the number of validated people candidates. The confidence score F0 corresponds to a confidence level that the target volume contains at least one person. The confidence score F0 can be a value between 0 and 1.

At 950, 960, and 970 respectively, the confidence scores F0, F1, and F2 are each averaged with confidence scores from previous frames, resulting in average confidence scores F0_(AVG), F1_(AVG) and F2_(AVG). In a preferred embodiment, the confidence scores F0, F1, F2 are weighted according to weights assigned to each frame. The weights are intended to filter out confidence scores generated from frames giving spurious results.

At 980, the average confidence scores F0_(AVG), F1_(AVG) and F2_(AVG) are used to determine the number of people present (or absent) in the target volume.
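The frame-weighted averaging of steps 950 through 970 can be illustrated with a fixed-length history, one instance per confidence score. The history length and the way weights are assigned to frames are not specified above and are assumptions here; the class name `ConfidenceAverager` is hypothetical.

```python
from collections import deque


class ConfidenceAverager:
    """Weighted running average of one confidence score over recent frames."""

    def __init__(self, history=5):
        # Oldest frames fall out automatically once the history is full.
        self.frames = deque(maxlen=history)

    def add(self, score, weight=1.0):
        """Record a score in [0, 1]; a weight of 0 suppresses a spurious frame."""
        self.frames.append((score, weight))

    def average(self):
        total_weight = sum(w for _, w in self.frames)
        if total_weight == 0:
            return 0.0
        return sum(s * w for s, w in self.frames) / total_weight
```

Three such averagers, one each for F0, F1, and F2, would supply the averaged scores used at step 980.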

Referring back to FIG. 6, the primary sensor 230 and the secondary sensor 240 according to the exemplary embodiment consider the confidence scores from step 980 to make a determination about the number of people candidates in the respective primary zone 210 and secondary zone 220, and a confidence level of that determination, as shown at decision step 652. If the confidence level is sufficient to make such a determination, then, when interfaced to the controller 310 using discrete I/O, a signal can be asserted to the controller 310 at step 672 to indicate whether no people are present, one person is present, or more than one person is present. If the confidence level is not sufficient to make such a determination, a signal is asserted to indicate that the sensor is “not ready,” at step 662.

At step 652, motion analysis between frames is used to decide whether a “not ready” signal should be asserted, i.e., whether the respective sensor has an unambiguous result and can determine the number of people in the observed zone. In an illustrative embodiment, motion detection is performed using an orthographic projection histogram of 3D points on the floor. Each point in the histogram is weighted following the square law, such that the closer the point is to the sensor, the less it contributes to the histogram value. A point twice as far away contributes four times as much, resulting in a normalized count. The sum of absolute differences is computed between the current frame and a frame several frames earlier, using a ring buffer. If the difference is excessive, the motion is sufficient to suggest that the observed scene is not at a steady state suitable for reporting a result. One skilled in the art will appreciate that other methods of motion detection and/or tracking objects between frames can be performed to determine a steady state sufficient to report a result. A sequence of such views and statistics for a duration (determined by the size of the ring buffer) is used to determine whether the “ready” or “not ready” signal is asserted, so that the number (or absence) of people in the observed zone can be determined.
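The motion test described above can be sketched with a floor-plane histogram and a ring buffer. The grid cell size, the threshold, and the choice to compare the current frame only against the oldest buffered frame are assumptions made for illustration; `floor_histogram` and `MotionDetector` are hypothetical names.

```python
from collections import deque


def floor_histogram(points, sensor_pos, cell_m, nx, ny):
    """Project 3D points (x, y, z) onto an nx-by-ny floor grid.

    Each point contributes its squared distance to the sensor, so a point
    twice as far away contributes four times as much.
    """
    hist = [[0.0] * nx for _ in range(ny)]
    sx, sy, sz = sensor_pos
    for x, y, z in points:
        ix, iy = int(x // cell_m), int(y // cell_m)
        if 0 <= ix < nx and 0 <= iy < ny:
            hist[iy][ix] += (x - sx) ** 2 + (y - sy) ** 2 + (z - sz) ** 2
    return hist


class MotionDetector:
    """Compares the current histogram with one from several frames earlier."""

    def __init__(self, frames_back=3, threshold=1.0):
        self.ring = deque(maxlen=frames_back + 1)
        self.threshold = threshold

    def update(self, hist):
        """Return True when the scene is steady enough to report a result."""
        self.ring.append(hist)
        if len(self.ring) < self.ring.maxlen:
            return False  # not enough history yet: remain "not ready"
        old = self.ring[0]
        sad = sum(abs(a - b) for ra, rb in zip(hist, old) for a, b in zip(ra, rb))
        return sad <= self.threshold
```

A sensor using this sketch would hold its "not ready" output until `update` returns True.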

The exemplary embodiment of the present invention can be implemented using the CPS-1000 PeopleSensor available from Cognex Corporation, Natick, Mass. for both the primary sensor 230 and the secondary sensor 240.

While the exemplary embodiment describes an implementation of the present invention in a basic rectangular mantrap, the invention can also be applied to large mantrap implementations and complex geometrical shaped mantraps. The secondary sensor can accommodate a large or an irregularly shaped secondary zone, through the use of a plurality of secondary sensors with the respective outputs logically combined (i.e., “ORed”). FIG. 8 depicts an exemplary arrangement of a plurality of secondary sensors in an “L” shaped mantrap 105. Referring to FIG. 8, the primary sensor 230 is mounted to observe the primary zone 210 in front of the airside door 110. The secondary zone is split into two regions, each with a secondary sensor. The first secondary zone 221 is observed by a first secondary sensor 241. The second secondary zone 222 is observed by a second secondary sensor 242. As shown in FIG. 8, the first secondary zone 221 can overlap the second secondary zone 222 to ensure complete coverage. One skilled in the art will appreciate that a plurality of secondary sensors can be adapted to provide complete coverage of a secondary zone of a mantrap that is shaped in an irregular pattern, or where regions of the mantrap secondary zone would be obscured from view of a single secondary sensor due to internal walls and/or partitions.

In an alternate embodiment of the present invention, additional image analysis can be performed to provide increased levels of security. The primary and secondary sensors in the exemplary embodiment analyze a three-dimensional space for features associated with objects or people in the respective zones. As described above, each of the sensors performs volume filtering to consider only those features that are detected in the 3D space above the respective primary zone 210 or secondary zone 220. The additional image analysis of the alternate embodiment will detect a person lying down, or attempting to bring foreign objects into the secure area.

A flowchart of the operation of the additional image analysis of the alternate embodiment is shown in FIG. 7. During operation, the three-dimensional space is analyzed according to the methods described above. At step 710, if there are no people or objects detected, e.g., the signal asserted by step 672 of FIG. 6 corresponds to no people or objects present, processing continues to step 720 where a comparison of a two-dimensional image is made to a baseline image 725.

An initial baseline image is provided during an initial setup configuration. To collect the initial baseline image, a plurality of images of the scene are acquired and statistics about the variation of each pixel are computed. If the variance of the intensity of a pixel is too high, it is added into a mask image so that it is not considered by subsequent processing. For example, a video monitor mounted within the mantrap will appear to be constantly changing appearance, and therefore, can be masked from consideration so that it does not falsely indicate the presence of a person or object in the region during operation. The computed statistics can also be used to set threshold levels used to determine changes that are significant and those that are not.
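The collection of the initial baseline image can be sketched as a per-pixel mean and variance over the setup images. The variance limit `max_variance` is a hypothetical parameter, and images are represented as lists of rows of intensities; the name `build_baseline` is also hypothetical.

```python
def build_baseline(images, max_variance):
    """Return (baseline, mask) computed from a list of same-sized images.

    baseline holds the per-pixel mean intensity. mask is True where the
    per-pixel variance is too high, so that later processing ignores it.
    """
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    baseline = [[0.0] * cols for _ in range(rows)]
    mask = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            values = [img[y][x] for img in images]
            mean = sum(values) / n
            variance = sum((v - mean) ** 2 for v in values) / n
            baseline[y][x] = mean
            mask[y][x] = variance > max_variance
    return baseline, mask
```

A pixel covering a video monitor would show a large variance across the setup images and would therefore be masked.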

The comparison step 720 compares the current two-dimensional rectified image (from steps 610 and 612 of FIG. 6) to the baseline image. If a pixel in the current image significantly differs in value from the baseline, it is noted. These differing points can be clustered together and if the resulting clusters have sufficient size, it would suggest that a foreign object is in the mantrap. The clustering can be performed using conventional blob image analysis.
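The comparison and clustering of step 720 can be sketched as thresholded differencing followed by connected-component grouping. This is a simple stand-in for conventional blob analysis that uses 4-connectivity; `diff_threshold` and `min_blob_size` are hypothetical parameters, and `mask` is assumed to mark pixels excluded from consideration.

```python
def find_blobs(current, baseline, mask, diff_threshold, min_blob_size):
    """Return clusters of significantly differing pixels as lists of (x, y)."""
    rows, cols = len(current), len(current[0])
    differs = [
        [not mask[y][x] and abs(current[y][x] - baseline[y][x]) > diff_threshold
         for x in range(cols)]
        for y in range(rows)
    ]
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for y in range(rows):
        for x in range(cols):
            if not differs[y][x] or seen[y][x]:
                continue
            # Flood fill to gather one connected cluster of differing pixels.
            seen[y][x] = True
            stack, blob = [(y, x)], []
            while stack:
                cy, cx = stack.pop()
                blob.append((cx, cy))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and differs[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(blob) >= min_blob_size:
                blobs.append(blob)
    return blobs
```

A non-empty result would suggest that a foreign object is in the mantrap, while isolated differing pixels below the size limit are ignored.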

At step 730, if a significant difference is not detected, processing continues to step 740 where the baseline image is updated so that the comparison step 720 does not become susceptible to gradual changes in appearance. At step 740, the baseline image 725 is linearly combined with the current image compared at step 720. Processing then continues for another of a continuous cycle of analysis.

At step 730, if a significant difference is detected, processing continues to step 735 where a significantly differing pixel increments a timestamp count. At step 745, if the timestamp count exceeds a threshold, the baseline pixel is updated at step 740. This threshold could be user settable, allowing the user to decide how fast differences in the appearance of the mantrap get blended into the baseline image. By setting the threshold long enough, the dynamic baseline can be rendered essentially static. At step 750, a signal is asserted to indicate to the controller that a person or object is detected, and processing continues for another of a continuous cycle of analysis.
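The dynamic baseline update of steps 730 through 750 can be sketched per pixel as follows. Several details are assumptions: the blend factor `alpha`, the resetting of a pixel's count once it again matches the baseline, and the use of the same linear blend for pixels whose count has exceeded the threshold. Masked pixels are omitted for brevity, and `update_baseline` is a hypothetical name.

```python
def update_baseline(baseline, counts, current, diff_threshold,
                    count_threshold, alpha=0.1):
    """Update baseline and counts in place; return True if any pixel differs.

    Matching pixels are blended into the baseline immediately. Differing
    pixels are blended only after they have differed for more than
    count_threshold consecutive frames.
    """
    detected = False
    for y, row in enumerate(current):
        for x, value in enumerate(row):
            if abs(value - baseline[y][x]) > diff_threshold:
                detected = True
                counts[y][x] += 1
                if counts[y][x] <= count_threshold:
                    continue  # not yet persistent enough to blend in
            else:
                counts[y][x] = 0
            # Linear combination of the baseline with the current image.
            baseline[y][x] = (1 - alpha) * baseline[y][x] + alpha * value
    return detected
```

Choosing a very large `count_threshold` keeps differing pixels out of the baseline indefinitely, which renders the baseline essentially static.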

Optionally, for an even higher level of security, one might cluster the pixels being updated, and if there are sufficient numbers and areas, a security guard might be notified with an image of the new baseline image.

If most of the pixels are different from the dynamic baseline, it could signify a drastic lighting change. This could be caused by something like a light burning out. In this case one could automatically reselect image exposure parameters, run 3D processing, reselect a dynamic 2D baseline and/or notify a security guard about the change.

When a person seeking entry into the secured region enters the mantrap, the primary zone must be masked out of the image in addition to the regions of high pixel value variance. When someone is exiting the secured region through the mantrap, the entire space (both primary and secondary zones) can be examined to make sure that the area is clear and no one is attempting an ambush.

In a second alternative embodiment of the present invention, both the primary sensor 230 and the secondary sensor 240 are a single three-dimensional machine vision sensor configured to observe both the primary zone and the secondary zone at the same time, or in rapid succession.

In yet another alternative embodiment of the present invention, the secondary sensor 240 is a presence/absence detector, or a series of presence/absence detectors. In this embodiment, for example, the secondary sensor can be a pressure-sensitive mat that outputs a signal indicating that a person or object is standing or resting on the mat. Alternatively, the presence/absence detector can be one or more light beam emitter/detector pairs that outputs a signal indicating that a person or object blocks the light emissions directed from the emitter to the detector.

Alternatively, the presence/absence detector can be an IR sensor that outputs a signal indicating that motion of a person or object is detected in the secondary zone. Further, one skilled in the art will appreciate that the secondary sensor according to the present invention can be any of a combination of various types of presence/absence detectors that can be logically combined to output a signal indicating that a person or object exists in the secondary zone.

Although various calibration methods are described herein in terms of exemplary embodiments of the invention, persons having ordinary skill in the art should appreciate that any number of calibration methods can be used without departing from the spirit and scope of the invention. Although the exemplary embodiment described herein is set up in the factory using factory setup procedures, persons having ordinary skill in the art should appreciate that any of the described setup steps can also be performed in the field without departing from the scope of the invention.

Although an interior orientation process is described herein for determining the internal geometry of cameras in terms of the camera constant, the image center, radial distortion coefficients and aspect ratio, persons having ordinary skill in the art should appreciate that additional intrinsic parameters may be added or some of these parameters ignored in alternative embodiments within the scope of the present invention.

Although ground plane calibration in the exemplary embodiments described herein is performed at the location of installation, persons having ordinary skill in the art should appreciate that ground plane calibration could also be performed in the factory or at alternate locations without departing from the spirit and scope of the invention.

Although the invention is described herein in terms of a two camera stereo vision system, persons skilled in the art should appreciate that a single camera can be used to take two or more images from different locations to provide stereo images within the scope of the invention. For example, a camera could take separate images from a plurality of locations. Alternatively, a plurality of optical components could be arranged to provide a plurality of consecutive views to a stationary camera for use as stereo images according to the invention. Such optical components include reflective optical components, for example, mirrors, and refractive optical components, for example, lenses.

Although exemplary embodiments of the present invention are described in terms of filtering objects having predetermined heights above the ground plane, persons having ordinary skill in the art should appreciate that a stereo vision system according to the present invention could also filter objects at a predetermined distance from any arbitrary plane such as a wall, without departing from the spirit or scope of the invention.

1. A method of controlling access to a secured area using a mantrap, the mantrap having a landside door and an airside door, the method comprising: monitoring a primary zone, the primary zone comprising a region within the mantrap having an area less than the area of the mantrap, to determine the presence of one person in the primary zone; monitoring a secondary zone, the secondary zone being an area comprising a region of the mantrap not including the primary zone, to determine the absence of any persons in the secondary zone; and controlling access through the landside door and the airside door in response to the steps of monitoring the primary zone and monitoring the secondary zone.
 2. The method according to claim 1 wherein the step of monitoring the primary zone further comprises: acquiring a stereo image of the primary zone; computing a first set of 3D features from the stereo image of the primary zone; and determining the presence of one person in the primary zone using the first set of 3D features.
 3. The method according to claim 2 wherein the step of monitoring the secondary zone further comprises: acquiring a stereo image of the secondary zone; computing a second set of 3D features from the stereo image of the secondary zone; and determining the absence of any person in the secondary zone using the second set of 3D features.
 4. The method according to claim 1 further comprising setting an alarm signal if the step of monitoring the primary zone fails to determine the presence of one person in the primary zone.
 5. The method according to claim 4 further comprising setting an alarm signal if the step of monitoring the secondary zone fails to determine the absence of any persons in the secondary zone.
 6. The method according to claim 2 further comprising filtering the first set of 3D features to exclude features that are computed to be substantially near the ground in the primary zone.
 7. The method according to claim 3 further comprising filtering the second set of 3D features to exclude features that are computed to be substantially near the ground in the secondary zone.
 8. The method according to claim 1 wherein both the step of monitoring the primary zone and monitoring the secondary zone are performed by a single three-dimensional machine vision sensor.
 9. A system for controlling access to a secured area using a mantrap, the system comprising: a mantrap having a lockable landside door and a lockable airside door; a primary sensor to detect the presence of a person in a primary zone within the mantrap, the primary zone comprising a region within the mantrap having an area less than the area of the mantrap; a secondary sensor to detect the absence of any persons within a secondary zone within the mantrap, the secondary zone comprising a region within the mantrap not including the primary zone; a controller coupled to the primary sensor and the secondary sensor, the controller actuating the lockable landside door and the lockable airside door in response to the output of the primary sensor and the secondary sensor.
 10. The system according to claim 9 wherein the primary sensor is a three-dimensional machine vision sensor adapted to monitor a first volume of space directly above the primary zone.
 11. The system according to claim 10 wherein the secondary sensor is a three-dimensional machine vision sensor adapted to monitor a second volume of space directly above the secondary zone.
 12. The system according to claim 10 wherein the secondary sensor comprises a plurality of three-dimensional machine vision sensors, the plurality of three dimensional machine vision sensors adapted to cooperatively monitor a second volume of space directly above the secondary zone, and wherein the controller is cooperatively coupled to each of the plurality of three dimensional machine vision sensors.
 13. The system according to claim 10 wherein the secondary sensor is a presence/absence detector.
 14. The system according to claim 13 wherein the presence/absence detector is a sensor selected from the list consisting of a pressure-sensitive mat, a light beam emitter/detector pair, and an infra-red motion sensor.
 15. A method for detecting objects in a mantrap, the method comprising: acquiring a stereo image of a region of the mantrap, the stereo image comprising a plurality of two-dimensional images of the region; computing a set of 3D features from the stereo image; determining the absence of any person in the region using the set of 3D features; comparing one of the plurality of two dimensional images of the region to a baseline image; and detecting an object in the mantrap from the step of comparing.
 16. The method according to claim 15 wherein the baseline image is computed from a plurality of images of the region of the mantrap when no known objects are present.
 17. The method according to claim 15 further comprising combining the baseline image with at least one of the plurality of two-dimensional images of the region if no objects are detected. 